ABSTRACT

Two order theoretic techniques were presented and compared. Ordering
theory of Krus and Bart (1974) and an extension of Takeya's item
relational structure analysis (IRS) by Tatsuoka and Tatsuoka (1981)
were used to extract the hierarchical item structure from three
datasets. Directed graphs were constructed and both methods were
assessed as to how well they reproduced the theoretical structure of
the data. It was discovered that the Krus and Bart (1974) procedure
more adequately represented the complex interrelationships among test
data than did the extended IRS method. Simulated data were found to
present many problems and to be inappropriate for research in this
area. Research in this area should include a large scale sampling
distribution study to determine the distribution properties of
simulated data. A more sophisticated method of generating binary
responses which accounts for the distribution of theta needs to be
developed. Also, a significance test and possibly a test of the
differences between two item characteristic curves should be
investigated. (Author/HFG)

Abstract

Two order theoretic techniques were presented and compared.
Ordering theory of Krus and Bart (1974) and an extension of Takeya's
item relational structure analysis (IRS) by Tatsuoka and Tatsuoka (1981)
were used to extract the hierarchical item structure from three
datasets. Directed graphs were constructed and both methods were
assessed as to how well they reproduced the theoretical structure of
the data. It was discovered that the Krus and Bart (1974) procedure
more adequately represented the complex interrelationships among test
data than did the extended IRS method. Simulated data were found to
present many problems and to be inappropriate for research in this area.



Introduction

In order to correctly sequence blocks of instruction it is
necessary to discover the underlying relationships between components of
the instructional unit. Often it is important to uncover the
hierarchical relationships of procedural tasks and to sequence
instruction to facilitate learning. Tests can be used to discover this
relationship. By assessing the relationships of test items, which
reflect components of the instructional unit, educators can design and
modify curricula. We can also check the extent to which we have
succeeded in constructing problems that require a hierarchy of skills to
be solved.

Methods for analyzing the relationships among items have existed
for years. These include scalogram analysis (Guttman, 1950; Shevell,
1975) and Loevinger's (1947) analysis of item homogeneity. More
recently, however, methodologies have been developed to extract the
best fitting hierarchy from test data.

The purpose of this study is to compare and assess two of these
procedures, order analysis (Krus & Bart, 1974; Airasian & Bart, 1973)
and item relational structure analysis (IRS) (Takeya, 1961). Both
methods will be used to reconstruct a theoretical relationship among
fraction addition test items.

Drawing from a combination of psychological measurement theory,
formal logic theory, information theory, and graph theory concepts,
order analysis and IRS present a general method of ordering two or more
items. Both theories of discovering the hierarchical relations among
items can be divided into two components: 1) defining the order
relation, and 2) extracting the item hierarchies.

Ordering theory has been developed to study hierarchical test
structure. The hierarchical structure of a test is defined by a network
of prerequisite relations among binary items (Bart, 1978). Binary data
matrices are analyzed with respect to this relationship. The converse
of the prerequisite relation is the dominance relation. If item i is a
prerequisite to item j then item j dominates item i. The prerequisite
or dominance relationship is of primary interest in ordering theory.
Briefly, a student is said to dominate an item if he/she passes that
item; if he/she fails, however, he/she is dominated by it. In the same
manner, item i is a prerequisite to item j for a given student if
he/she answers item i correctly and item j incorrectly. In general,
item i is said to be a prerequisite to item j if the percentage of
students who answer item i correctly and item j incorrectly is greater
than some constant.

Ordering analysis (Airasian & Bart, 1973; Bart & Krus, 1973) is a
deterministic measurement model which expands scalogram techniques to
assess nonlinear task networks. This model utilizes item response
patterns to extract both linear and nonlinear prerequisite relations
among tasks (Airasian, Madaus & Woods, 1975). Order analysis uses a set
of primitive logic to isolate logical orders among variables in a
hyperspace (Krus, 1978). The basis of an order relation, as defined by
order analysis, is the characteristic of strong simple orders. Wise
(1981) explains how strong simple orders have three properties:
asymmetry, connectedness, and transitivity. With regard to dominance,
asymmetry implies that elements i and j cannot simultaneously dominate
each other. Only one item can dominate the other. Connectedness, on
the other hand, states that there must be a dominance relationship
between two items i and j. The definition of transitivity allows
implied item-item relationships. For elements i, j, and k within an
order, if i dominates j, and j dominates k, then i dominates k.

In ordering theory all items must be dichotomously scored. If
subject k answers item i correctly he/she is given a score of 1, while
item i is scored 0 if subject k answers it incorrectly. Item i is then
defined as a prerequisite to item j if the occurrence of the response
pattern (01) for items i and j is not found. Response patterns (00),
(10), and (11) are referred to as confirmatory patterns and the pattern
(01) is called a disconfirmatory response pattern (Bart & Krus, 1973;
Airasian & Bart, 1973). Clearly the (00) and (11) response patterns do
not provide any information as to whether item i is a prerequisite to
item j.

Theoretically, there should be no inconsistencies of dominance.
There should be no ij dominances for some students and ji dominances for
others. However, even with unidimensional items such conflicting
relations occur in practice due to measurement error. The manner in
which item hierarchies are extracted and error in the data is dealt with
differs between the two order theoretic methods.

Bart and Krus (1973) originally attacked this problem in the
following manner. For any set of items, a matrix which indicates the
percentage of disconfirmatory response patterns for every pair of items
can be produced. Every cell entry will be the percentage of times that
a 0 for the ith item and a 1 for the jth item occurred. This table of
percentages can be used to identify item pairs related by a prerequisite
relationship. If the percentage of disconfirmatory patterns is less
than a given tolerance level for any ij pair, then item i can be said to
be a prerequisite to item j (Bart and Krus, 1973). The tolerance level
sets the amount of disconfirmatory response patterns which will be
allowed in defining the prerequisite relation. Finally, when the
various prerequisite relations have been defined, a hierarchy among the
items can be constructed by applying the transitivity property. The
hierarchical relationships among the items can be graphically
represented by use of directed graphs.
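
The tolerance-level procedure described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions,
not the ORDER2 program: the function names, the NumPy dependency, and the
use of Warshall's algorithm for the transitivity step are choices made
here for the example, and the 5% default tolerance is simply the value
used later in this study.

```python
import numpy as np

def disconfirmatory_matrix(X):
    """Percentage of (01) patterns for every ordered item pair (i, j).

    X is an N-by-n binary array (rows: students, columns: items).
    Entry [i, j] is the percentage of students scoring 0 on item i
    and 1 on item j.
    """
    X = np.asarray(X)
    n_students = X.shape[0]
    return 100.0 * ((1 - X).T @ X) / n_students

def prerequisite_relation(X, tolerance=5.0):
    """R[i, j] is True when item i is taken as a prerequisite to item j."""
    D = disconfirmatory_matrix(X)
    R = D < tolerance
    np.fill_diagonal(R, False)  # an item is not its own prerequisite
    return R

def transitive_closure(R):
    """Warshall's algorithm: add i -> k whenever i -> j and j -> k."""
    C = R.copy()
    for j in range(C.shape[0]):
        C |= C[:, j, None] & C[None, j, :]
    return C

# Hypothetical Guttman-like data: item 0 is easiest, item 2 hardest.
X = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])
R = prerequisite_relation(X, tolerance=5.0)
print(R.astype(int))
```

Note that two items of nearly equal difficulty can each fall under the
tolerance in both directions; the sketch leaves such pairs marked in both
cells rather than deciding how they should be treated.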

More recently, however, McNemar's (1947) z statistic for comparing
two correlated frequencies has been applied to analyze the prerequisite
relations (Bart & Krus, 1973). As before, every element of a matrix is
assigned a corresponding $z_{ij}$ value where

$$z_{ij} = \frac{c - d}{(c + d)^{1/2}} ,$$

where c is the frequency of (10) patterns, and d is the
frequency of (01) patterns. Again, a prerequisite relation is asserted
if the percentage of disconfirmatory cases is less than the percentage
of confirmatory cases. This translates into the condition that the
corresponding z values exceed the critical value for a predetermined
alpha level. This removes
chance prerequisite relationships due to measurement error.
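
As a small worked illustration of the z criterion, the sketch below
computes McNemar's statistic from the two discordant frequencies. The
function names are hypothetical, and the one-tailed critical value of
1.645 (alpha = .05) is an assumption made for the example; the source
does not state which alpha level was used.

```python
import math

def mcnemar_z(c, d):
    """McNemar's z for two correlated frequencies.

    c: number of (10) patterns (item i right, item j wrong)
    d: number of (01) patterns (item i wrong, item j right)
    """
    if c + d == 0:
        return 0.0  # no discordant pairs, so no evidence either way
    return (c - d) / math.sqrt(c + d)

def is_prerequisite(c, d, z_critical=1.645):
    """Assert i -> j only when z exceeds the chosen critical value."""
    return mcnemar_z(c, d) > z_critical

# 30 confirmatory (10) patterns against 6 disconfirmatory (01) patterns.
print(mcnemar_z(30, 6))        # (30 - 6) / sqrt(36) = 4.0
print(is_prerequisite(30, 6))
print(is_prerequisite(10, 8))
```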
The Japanese researcher Takeya, starting from the logic of Krus,
Bart, and Airasian, has presented a different method of ordering called
IRS. As with the Krus and Bart procedure, a binary data matrix is
analyzed in terms of prerequisite relationships. Once again, the
prerequisite relationship between items i and j is defined as follows:
success on item i is a prerequisite to success on item j. That is, the
response pattern (01) for items i and j, respectively, does not occur.
As before, the problem of the disconfirmatory pattern arises. Here
Takeya's ordering approach departs from the Krus and Bart procedure.

Takeya (1980a, 1981) considers the statistical independence or
dependence of scores obtained by two items. We denote a column vector
of a data matrix $X_{kj}$ by $\theta_j$ and its complement by
$\bar{\theta}_j$, where

$$\bar{\theta}_j = 1 - \theta_j .$$

If the proportion of correct and incorrect responses is expressed
by

$$P(\theta_j) = \frac{1}{N} \sum_{k=1}^{N} x_{kj}$$

and

$$P(\bar{\theta}_j) = 1 - P(\theta_j) ,$$

then the proportion of subjects getting both items i and j correct is

$$P(\theta_i, \theta_j) = \frac{1}{N} \sum_{k=1}^{N} x_{ki} x_{kj} .$$

The proportion of subjects getting item i incorrect and item j correct
is

$$P(\bar{\theta}_i, \theta_j) = \frac{1}{N} \sum_{k=1}^{N} (1 - x_{ki}) x_{kj} .$$

Takeya thus defines his coefficient of ordinality, $r^*_{ij}$, as:

$$r^*_{ij} = 1 - \frac{P(\bar{\theta}_i, \theta_j)}{P(\bar{\theta}_i) P(\theta_j)} .$$



Table 1 reflects this relation.



Insert Table 1 about here 



An IRS matrix is formed by calculating $r^*_{ij}$ for all pairs of i
and j. If $r^*_{ij}$ is larger than a constant, the (ij)-cell is
replaced by 1, otherwise 0.
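
The construction of the IRS matrix can be sketched as follows. This is a
minimal illustration written for this passage, not the IRS program of
Baillie and Tatsuoka: the function name and the NumPy dependency are
assumptions, the cutoff of .5 is the value adopted later in this study,
and marking undefined coefficients as NaN is a choice made here.

```python
import numpy as np

def irs_matrix(X, cutoff=0.5):
    """Return (r_star, binary) for an N-by-n binary response matrix X.

    r_star[i, j] = 1 - P(i wrong, j right) / (P(i wrong) * P(j right))
    binary[i, j] = 1 when r_star[i, j] exceeds the cutoff, else 0.
    """
    X = np.asarray(X, dtype=float)
    n_students = X.shape[0]
    p_correct = X.mean(axis=0)                 # P(theta_j)
    p_incorrect = 1.0 - p_correct              # P(theta-bar_i)
    p_01 = ((1.0 - X).T @ X) / n_students      # P(theta-bar_i, theta_j)
    denom = np.outer(p_incorrect, p_correct)
    with np.errstate(divide="ignore", invalid="ignore"):
        r_star = 1.0 - p_01 / denom
        # Undefined when nobody fails item i or nobody passes item j.
        r_star[denom == 0] = np.nan
        binary = (r_star > cutoff).astype(int)
    np.fill_diagonal(binary, 0)                # ignore self-relations
    return r_star, binary

# Hypothetical Guttman-like data: item 0 is easiest, item 2 hardest.
X = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])
r_star, binary = irs_matrix(X)
print(np.round(r_star, 3))
print(binary)
```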


However, unlike order analysis, Takeya's dominance relation does
not satisfy the transitivity law. For example, if item i dominates item
j, and item j dominates item k, item i does not dominate item k unless
$r^*_{ik} > c$. By his definition of an order relation, implied item
dominances are not allowed. Moreover, Takeya has not discussed an exact
procedure for extracting the hierarchical relationships among items from
the IRS matrix. So, Tatsuoka and Tatsuoka (1981) have proposed a
procedure to extract directed graphs from the IRS matrix which uphold
the transitivity law. It is this modified IRS procedure which will be
studied in this paper.

It should be noted that $r^*_{ij}$ has a direct relationship to
Loevinger's $H_{ij}$. Horst (1953) states that $H_{ij}$ is an average
$\phi/\phi_{\max}$. Thus if we define a fourfold contingency table as

Table 1

Contingency Table of Items i and j

                  Item j = 1              Item j = 0              Total
Item i = 1            a                       b                   a+b = $P(\theta_i)N$
Item i = 0            c                       d                   c+d = $P(\bar{\theta}_i)N$
Total             a+c = $P(\theta_j)N$    b+d = $P(\bar{\theta}_j)N$    N

with $P_i = (b+d)/N$ and $P_{i/j} = d/(c+d)$, Loevinger's $H_{ij}$ can
be shown to reduce to

$$H_{ij} = \frac{ad - bc}{(a+c)(c+d)} = \frac{\phi}{\phi_{\max}} , \quad \text{with } b > c .$$

Moreover, by defining $r^*_{ij}$ in a similar manner, Tatsuoka (1981)
and Sato (1981) show that $r^*_{ij}$ also reduces to

$$r^*_{ij} = \frac{ad - bc}{(a+b)(b+d)} = \frac{\phi}{\phi_{\max}} , \quad \text{for } b < c .$$

Thus

$$r^*_{ij} = H_{ij} = \frac{\phi}{\phi_{\max}} .$$

Although Loevinger's work appeared first, it was developed in another
context and not applied to extracting hierarchical relationships among
nonlinear task networks. For this reason the measure will be referred
to as Takeya's coefficient of ordinality.
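
The reduction of the coefficient of ordinality to cell frequencies can
be checked directly. The short derivation below is a sketch that assumes
the fourfold table is laid out with a = (1,1), b = (1,0), c = (0,1), and
d = (0,0) for the response pair (item i, item j), and N = a + b + c + d;
that cell labelling is an assumption made here for the illustration.

```latex
\begin{align*}
P(\bar{\theta}_i, \theta_j) &= \frac{c}{N}, \qquad
P(\bar{\theta}_i) = \frac{c+d}{N}, \qquad
P(\theta_j) = \frac{a+c}{N}, \\[4pt]
r^*_{ij} &= 1 - \frac{c/N}{\dfrac{(c+d)(a+c)}{N^2}}
          = 1 - \frac{cN}{(a+c)(c+d)} \\[4pt]
         &= \frac{(a+c)(c+d) - c(a+b+c+d)}{(a+c)(c+d)}
          = \frac{ad - bc}{(a+c)(c+d)} .
\end{align*}
```

Under this labelling the result is the same expression given for
Loevinger's coefficient when b > c. Exchanging the roles of items i and
j (that is, exchanging b and c) gives $(ad - bc)/[(a+b)(b+d)]$, the form
associated with b < c.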
The purpose of this paper is to compare these two order theoretic
methods and to assess which method more accurately extracts a
theoretical hierarchical structure from binary data. More precisely,
the order relation defined by ordering theory and the method of
extracting item hierarchies utilizing a given tolerance level of
disconfirmatory responses (Bart & Krus, 1973) will be compared to the
order relation defined by IRS and the chain extraction method developed
by Tatsuoka and Tatsuoka (1981) which upholds transitivity. Graphs
obtained by the Krus and Bart procedure and the extended IRS will be
compared to the procedural network for fraction addition (Tatsuoka &
Chevalaz, 1983) to see which best reproduces the theoretical hierarchy
of fraction addition skills.

Method

Test and Subjects

Klein, et al. (1981) described the construction of a 48-item
fraction-addition test for diagnosing erroneous rules resulting from
misconceptions occurring at one or more levels of the procedural
network. Klein and her associates constructed the test to consist of
two parallel subtests. Each pair of items was constructed in terms of
having identical procedural steps. The items reflect a variety of
skills which are required to correctly add two fractions of varying
types. Figure 1 is the procedural network for fraction addition as
presented in Tatsuoka and Chevalaz (1983).

Insert Figure 1 about here

In an effort to assess and compare the Krus and Bart procedure and
the modified IRS, the 48-item fraction test was administered to 148
seventh and eighth grade students. After extensive logical error
analysis (Klein, et al., 1981), and extraction of a unidimensional
subset of items by GETAB (Baillie, 1980), 36 items were retained for
study. The estimated a's and b's of the two-parameter logistic model
for the 36 items were calculated by GETAB (Baillie, 1982) along with
the means and variances and are presented in Tables 2 and 3.

Insert Tables 2 & 3 about here



Datasets



Three different datasets, REAL, CLEAN, and SIM1, were employed in
this study. Dataset REAL contains the binary responses for 148 students
on 36 items. To avoid contamination by reducing-task errors, each
student's first nonreduced answer was chosen as his/her response. Each
open-ended response was then converted into a decimal number and
compared to a decimal number answer key. Items were given a value of 1
if the response and answer key matched and 0 otherwise. With this
scoring procedure, choice of various common denominators or failure to
reduce answers would not affect scoring.

Klein, et al. (1981) stated that there are two methods of solving
fraction addition problems. The procedural network presented here,
however, only reflects Method A of solving fraction addition problems.
In this more commonly used method students add the whole number,
denominator, and numerator parts separately. On the other hand,
students who employ Method B first convert all mixed fractions to
improper fractions, then add and reduce. Dataset CLEAN is a subset of
REAL which consists of only those 119 students who used Method A when
adding fractions.

Table 2

Estimated a and b Values for 36 Fraction Addition Items

Table 3

Means and Variances for 36 Fraction Addition Items (N = 148)

A simulated dataset, SIM1, was generated following a commonly used
simulation procedure. First a pseudorandom number generator yielding a
normally distributed set with mean 0 and variance 1 was used to simulate
ability levels for 500 simulees. The probability that a given simulee
would pass a specific item was given by

$$P_i(\theta) = \frac{1}{1 + e^{-D a_i (\theta - b_i)}}$$

where a and b are the estimated a and b based on REAL and
presented earlier, and D is the scaling constant of the logistic model
(Lord, 1980). Next a random number between 0 and 1
was generated from a uniform distribution and compared to
$P_i(\theta)$. If the probability of passing the item was greater than
the random number, the simulated response was given a value of 1;
conversely, if the probability of passing the item was less than the
random number the simulated response was 0. In this manner 500
simulated response vectors
of 36 items were generated.
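
The simulation procedure just described can be sketched as below. This
is an illustrative reconstruction under stated assumptions, not the code
used in the study: the function name, the NumPy generator, the seed
argument, and the scaling constant D = 1.7 are all assumptions made for
the example.

```python
import numpy as np

def simulate_responses(a, b, n_simulees=500, seed=0, D=1.7):
    """Simulate binary responses under a two-parameter logistic model.

    a, b: item discrimination and difficulty estimates (one per item).
    seed and D are assumed values chosen for this sketch.
    Returns an n_simulees-by-n_items array of 0/1 responses.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Step 1: abilities drawn from a standard normal distribution.
    theta = rng.standard_normal(n_simulees)
    # Step 2: probability of passing each item for each simulee.
    p = 1.0 / (1.0 + np.exp(-D * a * (theta[:, None] - b)))
    # Step 3: compare each probability to a uniform random number.
    u = rng.random(p.shape)
    return (p > u).astype(int)

# Two hypothetical items: one very easy, one very hard.
X = simulate_responses(a=[1.0, 1.0], b=[-3.0, 3.0], n_simulees=500, seed=1)
print(X.shape)
print(X.mean(axis=0))  # item means (proportion passing each item)
```

Rerunning the sketch with a different seed gives a different response
matrix, which is the sensitivity to the random number seed examined in
the paragraphs that follow.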


To test the adequacy of SIM1 in reproducing the qualities of REAL,
GETAB was used to reestimate the item parameters. It was found,
however, that the two-parameter logistic model would not converge for
the simulated data. Furthermore, traditional item analysis showed that
SIM1 differed greatly from REAL. To further look at the plausibility of
using simulated data, four more simulated datasets, SIM2, SIM3, SIM4,
and SIM5, were generated using different random number seeds. Again,
the two-parameter logistic model would not converge for these datasets.
The means, variances, and closest estimates of a and b for the 36 items
and all five datasets are presented in Tables 4 and 5.

Insert Tables 4 & 5 about here

Table 4

Mean and Variance of 36 Items for Five Simulated Datasets

Table 5

a and b Values of 36 Items for Five Simulated Datasets

Item      Sim 1             Sim 2             Sim 3             Sim 4             Sim 5
            a        b        a        b        a        b        a        b        a        b
   1     .237    -.226     .272     .087     .122     .350     .165    -.270     .069     .580
   2    1.257    -.017    1.198    -.093    1.067    -.027    1.019    -.205    1.623    -.056
   3    7.284     .504    8.035     .491    7.052     .535    6.073     .615    7.516     .404
   4    3.172     .681    3.634     .635    3.583     .625    2.627     .759    3.511     .580
   5     .288   -1.359     .336   -1.337     .410    -.925     .296   -1.514     .333   -1.294
   6    1.470    -.313    1.618    -.424    1.496    -.268    1.548    -.384    1.590    -.259
   7    2.857     .416    3.608     .430    4.384     .435    3.583     .469    3.304     .315
   8     .551    -.585     .592    -.594     .533    -.674     .428    -.813     .637    -.616
   9    1.408    -.653    1.492    -.590    1.723    -.401    1.437    -.565    1.889    -.651
  10    2.687     .310    2.895     .297    3.504     .400    2.437    -.397    3.321     .221
  11    1.086    -.481    1.110    -.576    1.271    -.330    1.159    -.419    1.281    -.454
  12    3.921     .576    4.907     .494    5.820     .520    3.597     .656    6.306     .398
  13    1.381    -.352    1.119    -.325    1.318    -.240    1.138    -.273    1.052    -.359
  14   19.276     .523   15.287     .490   11.039     .512    2.004     .597    1.958     .383
  15    1.629    -.220    1.388    -.332    1.675    -.189    1.097    -.309    1.510    -.303
  16    1.875    -.689    1.336    -.822    1.851    -.587    1.556    -.771    2.150    -.557
  17    3.672     .557    3.750     .512    5.079     .527    3.024     .648    3.945     .401
  18    2.154    -.221    1.601    -.298    2.157    -.129    1.600    -.173    1.865    -.172
  19    1.142    -.224     .930    -.415    1.192    -.235     .919    -.342    1.276    -.289
  20    1.155    -.315    1.293    -.359    1.146    -.248    1.084    -.273    1.305    -.339
  21    5.200     .31?    7.?91 (illeg.) (illeg.)     .52?    5.035     .612    7.416     .423
  22    4.117     .621    3.979     .626    6.280     .592    4.018     .748    4.836     .554
  23    1.360    -.103    1.122    -.128    1.562    -.059    1.392    -.061    1.671    -.118
  24    1.558    -.348    1.541    -.417    1.402    -.333    1.395    -.399    1.552    -.415
  25    5.735     .465    4.433     .470    5.284     .431    4.126     .521    5.129     .329
  26    1.419    -.304    1.354    -.293    1.634    -.129    1.505    -.296    1.648    -.285
  27    2.190    -.339    2.296    -.433    2.564    -.248    1.840    -.415    2.831    -.454
  28   13.009     .521   10.033     .540    2.004     .541    7.197     .631   10.408     .443
  29    2.057    -.224    1.746    -.343    1.914    -.175    1.617    -.173    2.137    -.195
  30    3.803     .614    6.250     .571    5.173     .567    3.901     .727    4.382     .521
  31    1.035    -.468     .920    -.549    1.016    -.386     .865    -.354     .908    -.467
  32   11.555     .487    6.235     .502    7.911     .506    6.094     .587   12.117     .416
  33    1.044    -.569    1.083    -.541    1.220    -.367     .887    -.573    1.504    -.564
  34    1.656    -.470    1.480    -.544    1.831    -.325    1.257    -.523    1.530     .48?
  35    5.833     .620    6.047     .636   11.861     .606    4.874     .776    5.687     .539
  36     .979    -.298     .911    -.552    1.267    -.229     .973    -.322     .990    -.420

Note. Entries shown as (illeg.) or containing ? are not legible in the
source copy.
Clearly the choice of the random number generator seed or the
"randomness" has a great effect on the results of the simulation. This
counter-intuitive result warrants caution in the use of simulated data
in quantitative research. However, SIM1 was arbitrarily chosen for
inclusion in this study to determine if simulated data will reflect the
hierarchical item patterns in real data.

Order Analytic Procedures

To determine the capability of the Krus and Bart (1974) and the
Tatsuoka and Tatsuoka (1981) procedures for extracting item hierarchies,
all three datasets were analyzed and compared to the procedural network.
However, only the first subtest of 18 items will be included in the
analysis. This will aid in the interpretation as the graphs will be
less complex. The program ORDER2, written by Antonak, Bart, and Lele
(1979), extracted prerequisite relationships by the Krus and Bart
procedure, while the modified Takeya analysis was carried out by IRS
(Baillie & Tatsuoka, 1981). A tolerance level of 5% was chosen for the
Krus and Bart procedure based on recommendations in the literature
(Airasian and Bart, 1975; Airasian, et al., 1975). Based on Takeya's
guidelines (Takeya, 1980b) the cutoff for $r^*_{ij}$ was set at .5.

Regression Analysis

Finally, a multiple regression analysis was performed to assess
which item characteristics influenced item difficulty, i.e., students'
performance, and to what extent. Each item was dichotomously
scored on 16 characteristic variables, such as (1) fraction is of F+F
type or (3) the denominators are the same. The variables were coded 1
if the item possessed that quality and 0 otherwise. Item difficulty
was selected as the criterion, and the 16 characteristic variables were
selected as predictors. The 16 characteristic variables are presented
in Table 6.



Insert Table 6 about here 



Results 

The outcome of the multiple regression analysis indicates that the
linear combination of only five item characteristic variables accounts
for 87% of the variance in item difficulty. Variables 3, 1, 10, 16,
and 6 had a significant effect on students' performance. Table 7
presents these results.



Insert Table 7 about here 

Only these five significant item characteristics will be
represented in the directed graphs. By following the relationships
reflected in the graphs between items with similar and dissimilar
characteristics, we can determine the adequacy of the two procedures.

The directed graphs resulting from ORDER2 and IRS for dataset REAL
are presented in Figures 2 and 3, respectively. Figures 4 and 5 are the
resulting directed graphs for CLEAN.

Insert Figures 2, 3, 4 & 5 about here 

Examination of the directed graphs leads to several observations.
First, graphs obtained by ORDER2 for the two datasets are considerably
more complex than those obtained by IRS. ORDER2 shows more intricate
interrelationships among items on the test. Earlier it was shown that
the two-parameter logistic model converged for dataset REAL, satisfying
the assumption of unidimensionality. One would expect items of a
unidimensional test to be highly interrelated. Comparison of Figures 2
and 3 reveals that by this criterion, ORDER2 more accurately expresses
the data than IRS.

Table 6

Item Characteristic with Respect to Procedural Skills

Variable   Description
    1      Fraction is of F+F type
    2      Mixed: w1 S1/L1 + w2 S2/L2
    3      Denominators are same
    4      One of the denominators is a multiple of the other
    5      Two denominators are relatively prime
    6      Two denominators have a common divisor larger than one
    7      S1 + S2 < L (L is common denominator)
    8      S1 + S2 = L
    9      S1 + S2 is a multiple of L
   10      (S1 + S2)/L is a real number larger than 1
   11      The answer needs reducing
   12      The answer is a whole number
   13      The answer is a mixed number
   14      The fractions in a question can be reduced
   15      One of the numerators is larger than L (common denominator)
   16      Does second fraction need to be reduced?

Table 7

Regression of P_i on Five Item Characteristics
with Respect to Procedural Skills

Multiple R    R²        BETA Weights by Item Characteristic Number
                          3        1       10       16        6
   .937      .878       .873     .243    -.315    -.335    -.114

FIGURE 2: A Directed Graph of Real Data from ORDER2

FIGURE 3: A Directed Graph of Real Data from IRS

FIGURE 4: A Directed Graph of Clean Data from ORDER2

Both methods do a similar job in separating out those items which
have noncommon denominators from those which have common denominators.
All graphs show that the procedural skills required to successfully
complete common denominator problems are a prerequisite to the skills
needed to correctly answer noncommon denominator problems. It is
interesting to note that the common-noncommon attribute of an item
appears to be the most influential aspect in determining students'
performance. Noncommon denominator problems are not only more
difficult; by both methods they appear not to be closely interrelated
(connected in the directed graphs) with common denominator problems.
This, moreover, is a reiteration of the results of the multiple
regression analysis and lends further credence to order analysis.

The multiple regression analysis also demonstrated that the mixed
fraction (M+M) vs. pure fraction (F+F) distinction was not significant
in determining item difficulty. One would, a priori, have assumed that
this would be a significant predictor. However, it must be kept in mind
that the procedural network reflects only Method A of solving fraction
addition items. Since all parts of the fraction are added separately,
conversion to an improper fraction is not required, and added procedural
skills are not needed. In this sense M+M problems would not be much
more difficult than F+F problems. Again, graphs from both ORDER2 and IRS
reflect that fact. As discussed earlier, one would then assume that if
items of M+M and F+F type are similar in nature then there would be many
relationships or connections between and among these items. Again,
ORDER2 appears to display this more fully.

The relationship between items of the type "(S1+S2)/L is a real
number greater than 1" is assessed differently by ORDER2 and IRS. IRS
graphs for both REAL and CLEAN show a direct relationship between items
of this type. Items 18, 8, 2, and 10 are all connected in a hierarchy.
ORDER2, on the other hand, does not show this. Only in Figure 2 are
two items of this type related. Clearly, ORDER2 was not able
to pick up this relationship among the items while IRS was.

In Figures 2, 3, and 5, items appear that are related to no other
items by a prerequisite or dominance relation. IRS graphs for both REAL
and CLEAN show that item 5 is not clearly dominated by any items nor
does it dominate any other items. Furthermore, ORDER2 for REAL
separated out item 1 from the other items. Intuitively this does not
make sense; items 5 or 1 must be related to other items. Thus, these
items must be of a nature (one that is not reflected in the graphs) such
that students do not respond to them in any consistent manner. In this
respect the performance on any other item is totally unrelated to
performance on item 5 or 1. It is then a desirable quality of order
analysis to separate out items of this nature.

In the Appendix is a copy of the 36 item test administered to the
148 students. Upon examining items 5 and 1, no salient characteristic
appears that would make students respond in such a manner. IRT and
classical test theory analysis do not flag these items. Single item
groups and their relationship to the hierarchical structure of the test
are an unanswered problem in order analysis.

Finally, the hierarchical relationships between items in SIM1 are
depicted in Figures 6 and 7. By first looking at Figures 2, 3, 4, and
5, and then at Figures 6 and 7, it quickly becomes apparent that the
simulated dataset, SIM1, did not reproduce the hierarchical structure
among the 18 items. The graphical representation of these data further
exemplifies the inflated mean values presented in Table 4.
Neither the graph obtained by ORDER2 (Figure 6) nor that by IRS (Figure
7) is similar to the graphs obtained by ORDER2 and IRS for REAL and
CLEAN. All the interrelationships among similar items extracted by
ORDER2 have been destroyed. IRS, on the other hand, was able to extract
a structure that is somewhat related to the structure of REAL.

Insert Figures 6 & 7 about here

It was hypothesized that the extreme a values reflected in Table 2
had a great effect on the ability of these two procedures to reproduce
the observed data. To test this theory, estimated a and b values from
another dataset, REAL2, were calculated. REAL2 contains the binary
responses of the 148 students for 36 items scored by a stricter scoring
procedure. Each item was decomposed into its numerator part,
denominator part, and whole number part. A response was scored 1 if
each part of the response matched the three parts of the answer;
otherwise it was scored 0. It should be noted that this scoring
procedure necessitates that the student reduce his/her answer to lowest
terms, else his/her response is marked incorrect. Since the procedural
network does not account for reducing, the resulting directed graphs of
REAL2 will not reflect the procedural network. Estimated a and b values
for REAL2 were calculated by GETAB (Baillie, 1982) and are presented in
Table 8. The means and variances of the 36 items are presented in
Table 9.

FIGURE 7: A Directed Graph of Simulated Data from IRS

Insert Tables 8 & 9 about here

These new estimated a and b values were then used to simulate 500
response vectors, forming dataset NSIM. Once again, GETAB (Baillie,
1982) reestimated the item parameters of NSIM. The two-parameter
logistic model converged for these data and the estimated a and b
values are shown in Table 10.

Upon comparing Tables 8 and 10 it quickly becomes apparent that 
NSIM closely replicates the item characteristics of REAL2. However, if 
one compares Tables 9 and 11, great differences in the item means 
again appear. These differences in item difficulties are reflected in 
great differences in the directed graphs. As before, the hierarchical 
structure of the original data is destroyed. Figures 8, 9, 10, and 11 
display this result. 
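
The item means and variances being compared can be computed from a binary response matrix as sketched below. The function name is hypothetical, and the N - 1 divisor is an assumption, chosen because it agrees with the tabled values. The count of 38 correct responses out of 148 is inferred from the reported mean for item 1 of REAL2 and is not a figure given in the text.

```python
def item_means_and_variances(data):
    """Column means and (N - 1)-divisor variances of a 0/1 response matrix."""
    n = len(data)
    n_items = len(data[0])
    stats = []
    for j in range(n_items):
        column = [row[j] for row in data]
        mean = sum(column) / n
        variance = sum((x - mean) ** 2 for x in column) / (n - 1)
        stats.append((mean, variance))
    return stats


# Toy single-item dataset: 38 correct and 110 incorrect responses (N = 148).
toy = [[1]] * 38 + [[0]] * 110
mean, variance = item_means_and_variances(toy)[0]
# Rounded to three places these are .257 and .192, as for item 1 in Table 9.
```

Because the variance of a binary item is determined by its mean, the comparison of Tables 9 and 11 is in effect a comparison of item difficulties.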

Insert Tables 10 & 11 about here 

Insert Figures 8, 9, 10 & 11 about here 

Clearly, this type of simulated data should be used with great 
caution in quantitative research which assesses the merits and 
shortcomings of various analyses. It was shown that not only can the 
choice of random number generator seeds affect the data, but the quality 
of the original a and b values used in the simulation can have a great 
effect on the results. Also, simulated data were shown not to maintain 
the hierarchical structure of the original data. The great differences in 
Table 8 

Estimated a and b Values for 36 Fraction Addition Items 
From REAL2 (N = 148) 

Item       a         b 
  1       .848      .754 
  2      1.594      .127 
  3      1.935      .153 
  4      2.028      .708 
  5      1.227      .279 
  6      1.823      .083 
  7      2.118      .037 
  8       .962      .364 
  9       .950    -1.158 
 10      1.882      .239 
 11      1.079      .045 
 12      1.700      .135 
 13       .884      .161 
 14      2.234      .009 
 15      1.563     -.275 
 16      2.042     -.930 
 17      1.977      .156 
 18      1.426      .173 
 19      1.498      .210 
 20      1.365      .198 
 21      3.368      .219 
 22      2.065      .708 
 23      1.382      .348 
 24      1.828     -.802 
 25      3.999     -.102 
 26      1.766      .201 
 27       .971    -1.240 
 28      2.879      .428 
 29      1.440      .205 
 30      1.610      .472 
 31      2.101      .686 
 32      2.490      .122 
 33      1.573     -.310 
 34      1.510     -.644 
 35      1.685      .422 
 36      1.225      .155 

Table 9 

Means and Variances for 36 Fraction Addition Items 
from REAL2 (N = 148) 

Item     Mean    Variance 
  1      .257      .192 
  2      .392      .240 
  3      .392      .240 
  4      .250      .189 
  5      .351      .229 
  6      .405      .243 
  7      .419      .245 
  8      .331      .223 
  9      .608      .240 
 10      .372      .235 
 11      .399      .241 
 12      .392      .240 
 13      .372      .235 
 14      .426      .246 
 15      .473      .251 
 16      .595      .243 
 17      .392      .240 
 18      .378      .237 
 19      .372      .235 
 20      .372      .235 
 21      .392      .240 
 22      .250      .189 
 23      .338      .225 
 24      .439      .248 
 25      .453      .249 
 26      .378      .237 
 27      .622      .237 
 28      .338      .225 
 29      .372      .235 
 30      .311      .216 
 31      .257      .192 
 32      .405      .243 
 33      .480      .251 
 34      .541      .250 
 35      .324      .221 
 36      .378      .237 

Table 10 

a and b Values of 36 Items, Simulated Dataset NSIM (N = 500) 

Item       a          b 
  1       .718      1.014 
  2      1.463    [illegible] 
  3      2.128       .172 
  4      1.960    [illegible] 
  5      1.041       .326 
  6      1.706       .134 
  7      2.042       .005 
  8       .711       .420 
  9       .553    [illegible] 
 10      1.697    [illegible] 
 11       .734    [illegible] 
 12      1.773    [illegible] 
 13       .611    [illegible] 
 14      2.248      -.007 
 15      1.285    [illegible] 
 16      2.194    [illegible] 
 17      2.343    [illegible] 
 18      1.223    [illegible] 
 19      1.596    [illegible] 
 20      1.042       .254 
 21      3.036    [illegible] 
 22      1.785       .846 
 23      1.191       .262 
 24      1.560      -.188 
 25      3.406      -.141 
 26      1.742       .147 
 27       .877     -1.434 
 28      3.000       .392 
 29      1.030       .264 
 30      1.369       .586 
 31      1.394       .875 
 32      2.398       .138 
 33      1.349      -.472 
 34      1.337      -.701 
 35      1.727       .461 
 36       .963       .151 

Table 11 

Means and Variances for 36 Fraction Addition Items 
from NSIM (N = 500) 

Item      Mean       Variance 
  1       .308         .214 
  2       .484         .250 
  3       .454         .248 
  4       .256         .191 
  5       .426         .245 
  6       .468         .249 
  7       .508         .250 
  8       .422         .244 
  9       .796         .163 
 10       .440         .247 
 11       .500         .251 
 12       .484         .250 
 13       .486         .250 
 14       .512         .250 
 15       .568         .246 
 16       .790         .166 
 17       .444         .247 
 18    [illegible]     .249 
 19    [illegible]     .248 
 20    [illegible]     .247 
 21       .438         .247 
 22       .268         .197 
 23    [illegible]     .247 
 24       .566         .246 
 25       .558         .246 
 26       .464         .249 
 27       .802         .159 
 28       .378         .236 
 29       .442         .247 
 30       .346         .227 
 31       .276         .200 
 32       .464         .249 
 33       .642         .230 
 34       .700         .210 
 35       .370         .234 
 36       .472         .250 

FIGURE 8: A Directed Graph of REAL2 from ORDER2 

FIGURE 9: A Directed Graph of REAL2 from IRS 

FIGURE 10: A Directed Graph of NSIM from ORDER2 

FIGURE 11: A Directed Graph of NSIM from IRS 

LEGEND (Figures 8 to 11) 
   □                   M + M, M + F, or F + M 
   ○                   F + F 
   [symbol illegible]  Non-common denominator 
   [symbol illegible]  (S1 + S2)/L is a real number > 1 
   [symbol illegible]  Denominators have common divisor > 1 
   R                   Fraction in question can be reduced 

item means may, however, be caused by the distribution of θ. Close 
analysis of the properties of any simulated dataset is required before 
it is employed in any study. 
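
The dependence of an item mean on the ability distribution can be illustrated numerically. The sketch below integrates a two-parameter logistic curve over a normal ability density by the midpoint rule; it assumes a logistic scaling constant of 1.0 and normally distributed θ, and the function names are hypothetical. The a and b values are those of item 9 in Table 8.

```python
import math


def icc_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def expected_item_mean(a, b, theta_mean=0.0, theta_sd=1.0, n_points=2001):
    """Expected proportion correct: midpoint-rule integral over +/- 6 SD."""
    lo = theta_mean - 6.0 * theta_sd
    hi = theta_mean + 6.0 * theta_sd
    step = (hi - lo) / n_points
    total = 0.0
    for k in range(n_points):
        theta = lo + (k + 0.5) * step
        total += icc_2pl(theta, a, b) * normal_pdf(theta, theta_mean, theta_sd) * step
    return total


a, b = 0.950, -1.158  # item 9 in Table 8
p_standard = expected_item_mean(a, b, theta_mean=0.0)
p_shifted = expected_item_mean(a, b, theta_mean=-0.5)  # lower-ability population
```

With the item parameters held fixed, the lower-ability population yields a smaller expected item mean, so a simulated θ distribution that differs from the real one can shift item difficulties even when a and b are reproduced closely.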

Summary and Discussion 

Two order analytic approaches to the analysis of test structure 
have been presented and described. It was shown that for unidimensional 
data the Krus and Bart procedure more closely reconstructed the 
procedural network for fraction addition than the procedure proposed by 
Tatsuoka and Tatsuoka based on Takeya's IRS matrix. Thus, when trying 
to discover the relationships among procedural skills and to sequence 
instruction accordingly, the Krus and Bart procedure supplies more 
information about the hierarchical structure of tasks. Use of IRS, 
though, appears to be more appropriate when large amounts of error may 
be in the data. This is apparent from its ability to extract a 
structure from simulated data. 
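
The core idea of an ordering-theoretic analysis can be sketched as follows. This is a simplified illustration of the general principle (item i is taken as prerequisite to item j when few examinees fail i yet pass j), not a reimplementation of the ORDER2 program or of the IRS coefficient; the function name and the tolerance value are hypothetical.

```python
def ordering_relations(data, tolerance=0.05):
    """Return (i, j) pairs where item i is taken as prerequisite to item j.

    A pair is kept when the share of examinees who fail item i but pass
    item j (the disconfirming pattern) does not exceed `tolerance`.
    """
    n = len(data)
    n_items = len(data[0])
    edges = []
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            disconfirming = sum(1 for row in data if row[i] == 0 and row[j] == 1)
            if disconfirming / n <= tolerance:
                edges.append((i, j))
    return edges


# Toy Guttman-type data: item 0 is easiest, item 2 is hardest.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
edges = ordering_relations(toy, tolerance=0.0)
```

The resulting pairs are the arcs of a directed graph of the kind shown in the figures. Because every arc depends on the count of a single response pattern, changes in item difficulties can add or remove arcs and so alter the recovered hierarchy.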

Clearly, caution is warranted in the use of simulated data in 
quantitative research of the type carried out in this study. It was 
shown that not only can the means, variances, and a and b values of 
simulated datasets be greatly affected by the "random" nature of the 
simulation procedure and the original a and b values used as input, but 
that the hierarchical structure of the data is also greatly altered. 
The currently used simulation technique is inadequate in reproducing the 
data when a set of a values which includes exaggerated a's is used as the 
basis of the simulation. Furthermore, it was shown that this simulation 
technique can greatly alter the item difficulties. This may be due to 
the fact that the distribution of ability is not accounted for in the 
population. 

Research in this area should include a large scale sampling 
distribution study to determine the distribution properties of simulated 
data. A more sophisticated method of generating binary responses which 
accounts for the distribution of θ needs to be developed. Also, a 
significance test and possibly a test of the differences between two 
item characteristic curves should be investigated. 